home *** CD-ROM | disk | FTP | other *** search
Text File | 1995-03-23 | 73.3 KB | 1,539 lines | [TEXT/ROSA] |
- Common Lisp the Language, 2nd Edition
- -------------------------------------------------------------------------------
-
- Appendix A.
- Series
-
- By Richard C. Waters
-
- [change_begin]
- PREFACE: A series is a data structure much like a sequence, with similar kinds
- of operations. The difference is that in many situations, operations on series
- may be composed functionally and yet execute iteratively, without the need to
- construct intermediate series values explicitly. In this manner, series provide
- both the clarity of a functional programming style and the efficiency of an
- iterative programming style.
-
- The remainder of this chapter consists of a description by Richard C. Waters of
- his work on an existing implementation of series. This is the culmination of
- many years of design and use of this approach, during which some 100,000 lines
- of application code have been written (by about half a dozen people over the
- course of seven years) using the series facility in nearly all iteration
- situations. This includes one large system (KBEmacs) of over 40,000 lines of
- code.
-
- I have edited the chapter only very lightly to conform to the overall style of
- this book. Please see the Preface to this book for more information about the
- genesis of the series approach and its relationship to the work of X3J13.
-
- -Guy L. Steele Jr.
-
- [change_end]
- -------------------------------------------------------------------------------
-
- * Introduction
- * Series Functions
- o Scanners
- o Mapping
- o Truncation and Other Simple Transducers
- o Conditional and Other Complex Transducers
- o Collectors
- o Alteration of Series
- * Optimization
- o Basic Restrictions
- o Constraint Cycles
- o Defining New Series Functions
- o Declarations
- * Primitives
-
- -------------------------------------------------------------------------------
-
- A.1. Introduction
-
- [change_begin]
- Series combine aspects of sequences, streams, and loops. Like sequences, series
- represent totally ordered multi-sets. In addition, the series functions have
- the same flavor as the sequence functions-namely, they operate on whole series,
- rather than extracting elements to be processed by other functions. For
- instance, the series expression below computes the sum of the positive elements
- in a list.
-
- (collect-sum (choose-if #'plusp (scan '(1 -2 3 -4)))) => 4
-
- Like streams, series can represent unbounded sets of elements and are supported
- by lazy evaluation: each element of a series is not computed until it is
- needed. For instance, the series expression below returns a list of the first
- five even natural numbers and their sum. The call on scan-range returns a
- series of all the even natural numbers. However, since no elements beyond the
- first five are ever used, no elements beyond the first five are ever computed.
-
- (let ((x (subseries (scan-range :from 0 :by 2) 0 5)))
- (values (collect x) (collect-sum x)))
- => (0 2 4 6 8) and 20
-
- Like sequences and unlike streams, a series is not altered when its elements
- are accessed. For instance, both users of x above receive the same elements.
-
- A totally ordered multi-set of elements can be represented in a loop by the
- successive values of a variable. This is extremely efficient, because it avoids
- the need to store the elements as a group in any kind of data structure. In
- most situations, series expressions achieve this same high level of efficiency,
- because they are automatically transformed into loops before being evaluated or
- compiled. For instance, the first expression above is transformed into a loop
- like the following.
-
- (let ((sum 0))
- (dolist (i '(1 -2 3 -4) sum)
- (when (plusp i) (setq sum (+ sum i))))) => 4
-
- A wide variety of algorithms can be expressed clearly and succinctly with
- series expressions. In particular, at least 90 percent of the loops programmers
- typically write can be replaced by series expressions that are much easier to
- understand and modify, and just as efficient. From this perspective, the key
- feature of series is that they are supported by a rich set of functions. These
- functions more or less correspond to the union of the operations provided by
- the sequence functions, the loop clauses, and the vector operations of APL.
-
- Some series expressions cannot be transformed into loops. This is unfortunate,
- because while transformable series expressions are much more efficient than
- equivalent expressions involving sequences or streams, non-transformable series
- expressions are much less efficient. Whenever a problem comes up that blocks
- the transformation of a series expression, a warning message is issued. On the
- basis of information in the message, it is usually easy to provide an efficient
- fix for the problem (see section A.3).
-
- Fortunately, most series expressions can be transformed into loops. In
- particular, pure expressions (ones that do not store series in variables) can
- always be transformed. As a result, the best approach for programmers to take
- is simply to write series expressions without worrying about transformability.
- When problems come up, they can be ignored (since they cannot lead to the
- computation of incorrect results) or dealt with on an individual basis.
-
- -------------------------------------------------------------------------------
- Implementation note: The series functions and the theory underlying them are
- described in greater detail in [52,53]. These reports also discuss the
- algorithms required to transform series expressions into loops and explain how
- to obtain a portable implementation.
- -------------------------------------------------------------------------------
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.2. Series Functions
-
- [change_begin]
- Throughout this chapter the notation S is used to denote the jth element of
- the series S. As in a list or vector, the first element of a series has the
- subscript zero.
-
- The # macro character syntax #Zlist denotes a series that contains the elements
- of list. This syntax is also used when series are printed.
-
- (choose-if #'symbolp #Z(a 2 b)) => #Z(a b)
-
- Series are self-evaluating objects and the series data type is disjoint from
- all other types.
-
- [Type specifier]
- series element-type
-
- The type specifier (series element-type) denotes the set of series whose
- elements are all members of the type element-type.
-
- [Function]
- series arg &rest args
-
- The function series returns an unbounded series that endlessly repeats the
- values of the arguments. The second example below shows the preferred method
- for constructing a bounded series.
-
- (series 'b 'c) => #Z(b c b c b c ...)
- (scan (list 'a 'b 'c)) => #Z(a b c)
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- * Scanners
- * Mapping
- * Truncation and Other Simple Transducers
- * Conditional and Other Complex Transducers
- * Collectors
- * Alteration of Series
-
- -------------------------------------------------------------------------------
-
- A.2.1. Scanners
-
- [change_begin]
- Scanners create series outputs based on non-series inputs. Either they operate
- based on some formula (for example, scanning a range of integers) or they
- enumerate the elements in an aggregate data structure (for example, scanning
- the elements in a list or array).
-
- [Function]
- scan-range &key (:start 0) (:by 1) (:type 'number):upto :below :downto :above
- :length
-
- The function scan-range returns a series of numbers starting with the :start
- argument (default integer 0) and counting up by the :by argument (default
- integer 1). The :type argument (default number) is a type specifier indicating
- the type of numbers in the series produced. The :type argument must be a (not
- necessarily proper) subtype of number. The :start and :by arguments must be of
- that type.
-
- One of the last five arguments may be used to specify the kind of end test to
- be used; these are called termination arguments. If :upto is specified,
- counting continues only so long as the numbers generated are less than or equal
- to :upto. If :below is specified, counting continues only so long as the
- numbers generated are less than :below. If :downto is specified, counting
- continues only so long as the numbers generated are greater than or equal to
- :downto. If :above is specified, counting continues only so long as the numbers
- generated are greater than :above. If :length is specified, it must be a
- non-negative integer and the output series has this length.
-
- If none of the termination arguments are specified, the output has unbounded
- length. If more than one termination argument is specified, it is an error.
-
- (scan-range :upto 4) => #Z(0 1 2 3 4)
- (scan-range :from 1 :by -1 :above -4) => #Z(1 0 -1 -2 -3)
- (scan-range :from .5 :by .1 :type 'float) => #Z(.5 .6 .7 ...)
- (scan-range) => #Z(0 1 2 3 4 5 6 ...)
-
- [Function]
- scan sequence
- scan type sequence
-
- scan returns a series containing the elements of sequence in order. The type
- argument is a type specifier indicating the type of sequence to be scanned; it
- must be a (not necessarily proper) subtype of sequence. If type is omitted, it
- defaults to list. (This function exhibits an argument pattern that is unusual
- for Common Lisp: an ``optional'' argument preceding a required argument. This
- pattern cannot be expressed in the usual manner with &optional. It is indicated
- above by two definition lines, showing the two possible argument patterns.)
-
- If the sequence is a list, it must be a proper list ending in nil. Scanning is
- significantly more efficient if it can be determined at compile time whether
- type is a subtype of list or vector and for vectors what the length of the
- vector is.
-
- (scan '(a b c)) => #Z(a b c)
- (scan 'string "BAR") => #Z(#\B #\A #\R)
-
- [Function]
- scan-sublists list
-
- scan-sublists returns a series containing the successive sublists of list. The
- list must be a proper list ending in nil.
-
- (scan-sublists '(a b c)) => #Z((a b c) (b c) (c))
-
- [Function]
- scan-multiple type first-sequence &rest more-sequences
-
- Several sequences can be scanned at once by using several calls on scan. Each
- call on scan will test to see when its sequence runs out of elements and
- execution will stop as soon as any of the sequences are exhausted. Although
- very robust, this approach to scanning can be inefficient. In situations where
- it is known in advance which sequence is the shortest, scan-multiple can be
- used to obtain the same results more rapidly.
-
- scan-multiple is similar to scan except that several sequences can be scanned
- at once. If there are n sequence inputs, scan-multiple returns n series
- containing the elements of these sequences. It must be the case that none of
- the sequence inputs is shorter than the first sequence. All of the output
- series are the same length as the first input sequence. Extra elements in the
- other input sequences are ignored. Using scan-multiple is more efficient than
- using multiple instances of scan, because scan-multiple only has to check for
- the first input running out of elements.
-
- If type is of the form (values ... ), then there must be m sequence inputs
- and the ith sequence must have type . Otherwise there can be any number of
- sequence inputs, each of which must have type type.
-
- (multiple-value-bind (data weights)
- (scan-multiple 'list '(1 6 3 2 8) '(2 3 3 3 2))
- (collect (map-fn t #'* data weights)))
- => (2 18 9 6 16)
-
- [Function]
- scan-lists-of-lists lists-of-lists &optional leaf-test
- scan-lists-of-lists-fringe lists-of-lists &optional leaf-test
-
- The argument lists-of-lists is viewed as a tree where each internal node is a
- non-empty list and the elements of the list are the children of the node.
- scan-lists-of-lists and scan-lists-of-lists-fringe each scan lists-of-lists in
- preorder and return a series of its nodes. scan-lists-of-lists returns every
- node in the tree. scan-lists-of-lists-fringe returns only the leaf nodes.
-
- The scan proceeds as follows. The argument lists-of-lists can be any Lisp
- object. If lists-of-lists is an atom or satisfies the predicate leaf-test (if
- present), it is a leaf node. (The predicate can count on being applied only to
- conses.) Otherwise, lists-of-lists is a (not necessarily proper) list. The
- first element of lists-of-lists is recursively scanned in full, followed by the
- second and so on until a non-cons cdr is encountered. Whether or not this final
- cdr is nil, it is ignored.
-
- (scan-lists-of-lists '((2) (nil)))
- => #Z(((2) (nil)) (2) 2 (nil) nil)
- (scan-lists-of-lists-fringe '((2) (nil))) => #Z(2 nil)
- (scan-lists-of-lists-fringe '((2) (nil))
- #'(lambda (e) (numberp (car e))))
- => #Z((2) nil)
-
- [Function]
- scan-alist a-list &optional (test #'eql)
- scan-plist plist
- scan-hash table
-
- When given an association list, a property list, or a hash table
- (respectively), each of these functions produces two outputs: a series of keys
- K and a series of the corresponding values V. Each key in the input appears
- exactly once in the output, even if it appears more than once in the input.
- (The test argument of scan-alist specifies the equality test between keys; it
- defaults to eql.) The two outputs have the same length. Each V is the value
- returned by the appropriate accessing function (cdr of assoc, getf, or gethash,
- respectively) when given K . scan-alist and scan-plist scan keys in the order
- they appear in the underlying structure. scan-hash scans keys in no particular
- order.
-
- (scan-plist '(a 1 b 3)) => #Z(a b) and #Z(1 3)
- (scan-alist '((a . 1) nil (a . 3) (b . 2)))
- => #Z(a b) and #Z(1 2)
-
- [Function]
- scan-symbols &optional (package *package*)
-
- scan-symbols returns a series, in no particular order, and possibly containing
- duplicates, of the symbols accessible in package (which defaults to the current
- package).
-
- [Function]
- scan-file file-name &optional (reader #'read)
-
- scan-file opens the file named by the string file-name and applies the function
- reader to it repeatedly until the end of the file is reached. Reader must
- accept the standard input function arguments input-stream, eof-error-p, and
- eof-value as its arguments. (For instance, reader can be read,
- read-preserving-white-space, read-line, or read-char.) If omitted, reader
- defaults to read. scan-file returns a series of the values returned by reader,
- up to but not including the value returned when the end of the file is reached.
- The file is correctly closed, even if an abort occurs.
-
- [Function]
- scan-fn type init step &optional test
-
- The higher-order function scan-fn supports the general concept of scanning. The
- type argument is a type specifier indicating the type of values returned by
- init and step. The values type specifier can be used for this argument to
- indicate multiple types; however, type cannot indicate zero values. If type
- indicates m types , then scan-fn returns m series T1, ..., Tm, where Ti has
- the type (series ). The arguments init, step, and test are functions.
-
- The init must be of type (function () (values ... )).
-
- The step must be of type (function ( ... ) (values ... )).
-
- The test (if present) must be of type (function ( ... ) t).
-
- The elements of the Ti are computed as follows:
-
- (values T1 ... Tm ) = (funcall init)
- (values T1 ... Tm ) = (funcall step T1 ... Tm )
-
- The outputs all have the same length. If there is no test, the outputs have
- unbounded length. If there is a test, the outputs consist of the elements up
- to, but not including, the first elements (with index j, say) for which the
- following termination test is not nil.
-
- (funcall test T1 ... Tm )
-
- It is guaranteed that step will not be applied to the elements that pass this
- termination test.
-
- If init, step, or test has side effects when invoked, it can count on being
- called in the order indicated by the equations above, with test called just
- before step on each cycle. However, given the lazy evaluation nature of series,
- these functions will not be called until their outputs are actually used (if
- ever). In addition, no assumptions can be made about the relative order of
- evaluation of these calls with regard to execution in other parts of a given
- series expression. The first example below scans down a list stepping two
- elements at a time. The second example generates two unbounded series: the
- integers counting up from 1 and the sequence of partial sums of the first i
- integers.
-
- (scan-fn t #'(lambda () '(a b c d)) #'cddr #'null)
- => #Z((a b c d) (c d))
-
- (scan-fn '(values integer integer)
- #'(lambda () (values 1 0))
- #'(lambda (i sum) (values (+ i 1) (+ sum i))))
- => #Z(1 2 3 4 ...) and #Z(0 1 3 6 ...)
-
- [Function]
- scan-fn-inclusive type init step test
-
- The higher-order function scan-fn-inclusive is the same as scan-fn except that
- the first set of elements for which test returns a non-null value is included
- in the output. As with scan-fn, it is guaranteed that step will not be applied
- to the elements for which test is non-null.
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.2.2. Mapping
-
- [change_begin]
- By far the most common kind of series operation is mapping. In cognizance of
- this fact, four different ways are provided for specifying mapping: one
- fundamental form (map-fn) and three shorthand forms that are more convenient in
- particular common situations.
-
- [Function]
- map-fn type function &rest series-inputs
-
- The higher-order function map-fn supports the general concept of mapping. The
- type argument is a type specifier indicating the type of values returned by
- function. The values construct can be used to indicate multiple types; however,
- type cannot indicate zero values. If type indicates m types , then map-fn
- returns m series T1, ..., Tm, where Ti has the type (series ). The argument
- function is a function. The remaining arguments (if any) are all series. Let
- these series be S1, ..., Sn and suppose that Si has the type (series ).
-
- The function must be of type
-
- (function ( ... ) (values ... ))
-
- The length of each output is the same as the length of the shortest input. If
- there are no bounded series inputs, the outputs are unbounded. The elements of
- the Ti are the results of applying function to the corresponding elements of
- the series inputs.
-
- (values T1 ... Tm ) == (funcall function S1 ... Sn )
-
- If function has side effects, it can count on being called first on the Si ,
- then on the Si , and so on. However, given the lazy evaluation nature of
- series, function will not be called on any group of input elements until the
- result is actually used (if ever). In addition, no assumptions can be made
- about the relative order of evaluation of the calls on function with regard to
- execution in other parts of a given series expression.
-
- (map-fn 'integer #'+ #Z(1 2 3) #Z(4 5)) => #Z(5 7)
- (map-fn t #'gensym) => #Z(#:G3 #:G4 #:G5 ...)
- (map-fn '(values integer rational) #'floor #Z(1/4 9/5 12/3))
- => #Z(0 1 4) and #Z(1/4 4/5 0)
-
- The # macro character syntax #M makes it easy to specify uses of map-fn where
- type is t and the function is a named function. The notation (#Mfunction ...)
- is an abbreviation for (map-fn t #'function ...). The form function can be the
- printed representation of any Lisp object. The notation #Mfunction can appear
- only in the function position of a list.
-
- (collect (#M1+ (scan '(1 2 3)))) => (2 3 4)
-
- [Macro]
- mapping ({({var | ({var}*)} value)}*) {declaration}* {form}*
-
- The macro mapping makes it easy to specify uses of map-fn where type is t and
- the function is a literal lambda. The syntax of mapping is analogous to that of
- let. The binding list specifies zero or more variables that are bound in
- parallel to successive values of series. The value part of each pair is an
- expression that must produce a series. The declarations and forms are treated
- as the body of a lambda expression that is mapped over the series values. A
- series of the first values returned by this lambda expression is returned as
- the result of mapping.
-
- (mapping ((x r) (y s)) ...) ==
- (map-fn t #'(lambda (x y) ...) r s)
- (mapping ((x (scan '(2 -2 3))))
- (expt (abs x) 3))
- => #Z(8 8 27)
-
- The form mapping supports a special syntax that facilitates the use of series
- functions returning multiple values. Instead of being a single variable, the
- variable part of a var-value pair can be a list of variables. This list is
- treated the same way as the first argument to multiple-value-bind and can be
- used to access the elements of multiple series returned by a series function.
-
- (mapping (((i v) (scan-plist '(a 1 b 2))))
- (list i v))
- => #Z((a 1) (b 2))
-
- [Macro]
- iterate ({({var | ({var}*)} value)}*) {declaration}* {form}*
-
- The form iterate is the same as mapping, except that after mapping the forms
- over the values, the results are discarded and nil is returned.
-
- (let ((item (scan '((1) (-2) (3)))))
- (iterate ((x (#Mcar item)))
- (if (plusp x) (prin1 x))))
- => nil (after printing ``13'')
-
- To a first approximation, iterate and mapping differ in the same way as mapc
- and mapcar. In particular, like mapc, iterate is intended to be used in
- situations where the forms are being evaluated for side effects rather than for
- their results. However, given the lazy evaluation semantics of series, the
- difference between iterate and mapping is more than just a question of
- efficiency.
-
- If mapcar is used in a situation where the output is not used, time is wasted
- unnecessarily creating the output list. However, if mapping is used in a
- situation where the output is not used, no computation is performed, because
- series elements are not computed until they are used. Thus iterate can be
- thought of as a declaration that the indicated computation is to be performed
- even though the output is not used for anything.
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.2.3. Truncation and Other Simple Transducers
-
- [change_begin]
- Transducers compute series from series and form the heart of most series
- expressions. Mapping is by far the most common transducer. This section
- presents a number of additional simple transducers.
-
- [Function]
- cotruncate &rest series-inputs
- until bools &rest series-inputs
- until-if pred &rest series-inputs
-
- Each of these functions accepts one or more series inputs S1, ..., Sn as its
- &rest argument and returns n series outputs T1, ..., Tn that contain the same
- elements in the same order-that is, Ti =Si . Let k be the length of the
- shortest input Si. cotruncate truncates the series so that each output has
- length k. Let be the position of the first element in the boolean series
- bools that is not nil or, if every element is nil, the length of bools. until
- truncates the series so that each output has length (min k ). Let be the
- position of the first element in S1 such that (pred S1 ) is not nil or, if
- there is no such element, the length of S1. until-if truncates the series so
- that each output has length (min k ).
-
- (cotruncate #Z(1 2 -3 4) #Z(a b c))
- => #Z(1 2 -3) and #Z(a b c)
- (until #Z(nil nil t nil) #Z(1 2 -3 4) #Z(a b c))
- => #Z(1 2) and #Z(a b)
- (until-if #'minusp #Z(1 2 -3 4) #Z(a b c))
- => #Z(1 2) and #Z(a b)
-
- [Function]
- previous items &optional (default nil) (amount 1)
-
- The series returned by previous is the same as the input series items except
- that it is shifted to the right by the positive integer amount. The shifting is
- done by inserting amount copies of default before items and discarding amount
- elements from the end of items.
-
- (previous #Z(10 11 12) 0) => #Z(0 10 11)
-
- [Function]
- latch items &key :after :before :pre :post
-
- The series returned by latch is the same as the input series items except that
- some of the elements are replaced by other values. latch acts like a latch
- electronic circuit component. Each input element causes the creation of a
- corresponding output element. After a specified number of non-null input
- elements have been encountered, the latch is triggered and the output mode is
- permanently changed.
-
- The :after and :before arguments specify the latch point. The latch point is
- just after the :after-th non-null element in items or just before the
- :before-th non-null element. If neither :after nor :before is specified, an
- :after of 1 is assumed. If both are specified, it is an error.
-
- If a :pre is specified, every element prior to the latch point is replaced by
- this value. If a :post is specified, every element after the latch point is
- replaced by this value. If neither is specified, a :post of nil is assumed.
-
- (latch #Z(nil c nil d e)) => #Z(nil c nil nil nil)
- (latch #Z(nil c nil d e) :before 2 :post t) => #Z(nil c nil t t)
-
- [Function]
- collecting-fn type init function &rest series-inputs
-
- The higher-order function collecting-fn supports the general concept of a
- simple transducer with internal state. The type argument is a type specifier
- indicating the type of values returned by function. The values construct can be
- used to indicate multiple types; however, type cannot indicate zero values. If
- type indicates m types , then collecting-fn returns m series T1, ..., Tm,
- where Ti has the type (series ). The arguments init and function are
- functions. The remaining arguments (if any) are all series. Let these series be
- S1, ..., Sn and suppose that Si has the type (series ).
-
- The init must be of type (function () (values ... )).
-
- The function must be of type
-
- (function ( ... ... ) (values ... ))
-
- The length of each output is the same as the length of the shortest input. If
- there are no bounded series inputs, the outputs are unbounded. The elements of
- the Ti are computed as follows:
-
- (values T1 ... Tm ) ==
- (multiple-value-call function (funcall init) S1 ... Sn )
-
- (values T1 ... Tm ) ==
- (funcall function T1 ... Tm S1 ... Sn )
-
- If init or function has side effects, it can count on being called in the order
- indicated by the equations above. However, given the lazy evaluation nature of
- series, these functions will not be called until their outputs are actually
- used (if ever). In addition, no assumptions can be made about the relative
- order of evaluation of these calls with regard to execution in other parts of a
- given series expression. The second example below computes a series of partial
- sums of the numbers in an input series. The third example computes two output
- series: the partial sums of its first input and the partial products of its
- second input.
-
- (defun running-averages (float-list)
- (multiple-value-call #'map-fn
- 'float #'/
- (collecting-fn '(values float integer)
- #'(lambda () (values 0.0 0)
- #'(lambda (s n x) (values (+ s x) (+ n 1))))
- float-list)))
-
- (collecting-fn 'integer #'(lambda () 0) #'+ #Z(1 2 3))
- => #Z(1 3 6)
-
- (collecting-fn '(values integer integer)
- #'(lambda () (values 0 1))
- #'(lambda (sum prod x y)
- (values (+ sum x) (* prod y)))
- #Z(4 6 8)
- #Z(1 2 3))
- => #Z(4 10 18) and #Z(1 2 6)
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.2.4. Conditional and Other Complex Transducers
-
- [change_begin]
- This section presents a number of complex transducers, including ones that
- support conditional computation.
-
- [Function]
- choose bools &optional (items bools)
- choose-if pred items
-
- Each of these functions takes in a series of elements (items) and returns a
- series containing the same elements in the same order, but with some elements
- removed. choose removes itemsj if boolsj is nil or j is beyond the end of
- bools. If items is omitted, choose returns the non-null elements of bools.
- choose-if removes itemsj if (pred itemsj) is nil.
-
- (choose #Z(t nil t nil) #Z(a b c d)) => #Z(a c)
- (collect-sum (choose-if #'plusp #Z(-1 2 -3 4))) => 6
-
- [Function]
- expand bools items &optional (default nil)
-
- expand is a quasi-inverse of choose. The output contains the elements of the
- input series items spread out into the positions specified by the non-null
- elements in bools-that is, itemsj is in the position occupied by the jth
- non-null element in bools. The other positions in the output are occupied by
- default. The output stops as soon as bools runs out of elements or a non-null
- element in bools is encountered for which there is no corresponding element in
- items.
-
- (expand #Z(nil t nil t t) #Z(a b c)) => #Z(nil a nil b c)
- (expand #Z(nil t nil t t) #Z(a)) => #Z(nil a nil)
-
- [Function]
- split items &rest test-series-inputs
- split-if items &rest test-predicates
-
- These functions are like choose and choose-if except that instead of producing
- one restricted output, they partition the input series items between several
- outputs. If there are n test inputs following items, then there are n+1
- outputs. Each input element is placed in exactly one output series, depending
- on the outcome of a sequence of tests. If the element itemsj fails the first
- k-1 tests and passes the kh test, it is put in the kth output. If itemsj fails
- every test, it is placed in the last output. In addition, all output stops as
- soon as any series input runs out of elements. The test inputs to split are
- series of values; itemsj passes the kth test if the jth element of the kth test
- series is not nil. The test inputs to split-if are predicates; itemsj passes
- the kth test if the kth test predicate returns non-null when applied to itemsj.
-
- (split #Z(-1 2 3 -4) #Z(t nil nil t))
- => #Z(-1 -4) and #Z(2 3)
- (multiple-value-bind (+x -x) (split-if #Z(-1 2 3 -4) #'plusp)
- (values (collect-sum +x) (collect-sum -x)))
- => 5 and -5
-
- [Function]
- catenate &rest series-inputs
-
- catenate combines two or more series into one long series by appending them end
- to end. The length of the output is the sum of the lengths of the inputs.
-
- (catenate #Z(b c) #Z() #Z(d)) => #Z(b c d)
-
- [Function]
- subseries items start &optional below
-
- subseries returns a series containing the elements of the input series items
- indexed by the non-negative integers from start up to, but not including,
- below. If below is omitted or greater than the length of items, the output goes
- all the way to the end of items.
-
- (subseries #Z(a b c d) 1) => #Z(b c d)
- (subseries #Z(a b c d) 1 3) => #Z(b c)
-
- [Function]
- positions bools
-
- positions returns a series of the indices of the non-null elements in the
- series input bools.
-
- (positions #Z(t nil t 44)) => #Z(0 2 3)
-
- [Function]
- mask monotonic-indices
-
- mask is a quasi-inverse of positions. The series input monotonic-indices must
- be a strictly increasing series of non-negative integers. The output, which is
- always unbounded, contains t in the positions specified by monotonic-indices
- and nil everywhere else.
-
- (mask #Z(0 2 3)) => #Z(t nil t t nil nil ...)
- (mask #Z()) => #Z(nil nil ...)
- (mask (positions #Z(nil a nil b nil)))
- => #Z(nil t nil t nil ...)
-
- [Function]
- mingle items1 items2 comparator
-
- The series returned by mingle contains all and only the elements of the two
- input series. The length of the output is the sum of the lengths of the inputs
- and is unbounded if either input is unbounded. The order of the elements
- remains unchanged; however, the elements from the two inputs are stably
- intermixed under the control of the comparator.
-
- The comparator must accept two arguments and return non-null if and only if its
- first argument is strictly less than its second argument (in some appropriate
- sense). At each step, the comparator is used to compare the current elements in
- the two series. If the current element from items2 is strictly less than the
- current element from items1, the current element is removed from items2 and
- transferred to the output. Otherwise, the next output element comes from
- items1.
-
- (mingle #Z(1 3 7 9) #Z(4 5 8) #'<) => #Z(1 3 4 5 7 8 9)
- (mingle #Z(1 7 3 9) #Z(4 5 8) #'<) => #Z(1 4 5 7 3 8 9)
-
- [Function]
- chunk m n items
-
- This function has the effect of breaking up the input series items into
- (possibly overlapping) chunks of length m. The starting positions of successive
- chunks differ by n. The inputs m and n must both be positive integers.
-
- chunk produces m output series. The ith chunk provides the ith element for each
- of the m outputs. Suppose that the length of items is l. The length of each
- output is . The ith element of the kth output is the (i*n+k)th element of
- items (i and k counting from zero).
-
- Note that if l<m, there will be no output elements, and if l-m is not a
- multiple of n, the last few input elements will not appear in the output. If
- mn, one can guarantee that the last chunk will contain the last element of
- items by catenating
- n-1 copies of an appropriate padding value to the end of items.
-
- The first example below shows chunk being used to compute a moving average. The
- second example shows chunk being used to convert a property list into an
- association list.
-
- (mapping (((xi xi+1 xi+2) (chunk 3 1 #Z(1 5 3 4 5 6))))
- (/ (+ xi xi+1 xi+2) 3))
- => #Z(3 4 4 5)
-
- (collect
- (mapping (((prop val) (chunk 2 2 (scan '(a 2 b 5 c 8)))))
- (cons prop val)))
- => ((a . 2) (b . 5) (c . 8))
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.2.5. Collectors
-
- [change_begin]
- Collectors produce non-series outputs based on series inputs. They either
- create a summary value based on some formula (the sum, for example) or collect
- the elements of a series in an aggregate data structure (such as a list).
-
- [Function]
- collect-first items &optional (default nil)
- collect-last items &optional (default nil)
- collect-nth n items &optional (default nil)
-
- Given a series items, these functions return the first element, the last
- element, and the nth element, respectively. If items has no elements (or no nth
- element), default is returned. If default is not specified, then nil is used
- for default.
-
- (collect-first #Z() 'z) => z
- (collect-last #Z(a b c)) => c
- (collect-nth 1 #Z(a b c)) => b
-
- [Function]
- collect-length items
-
- collect-length returns the number of elements in a series.
-
- (collect-length #Z(a b c)) => 3
-
- [Function]
- collect-sum numbers &optional (type 'number)
-
- collect-sum returns the sum of the elements in a series of numbers. The type is
- a type specifier that indicates the type of sum to be created. If type is not
- specified, then number is used for the type. If there are no elements in the
- input, a zero (of the appropriate type) is returned.
-
- (collect-sum #Z(1.1 1.2 1.3)) => 3.6
- (collect-sum #Z() 'complex) => #C(0 0)
-
- [Function]
- collect-max numbers
- collect-min numbers
-
- Given a series of non-complex numbers, these functions compute the maximum
- element and the minimum element, respectively. If there are no elements in the
- input, nil is returned.
-
- (collect-max #Z(2 1 4 3)) => 4
- (collect-min #Z(1.2 1.1 1.4 1.3)) => 1.1
- (collect-min #Z()) => nil
-
- [Function]
- collect-and bools
-
- collect-and returns the and of the elements in a series. As with the macro and,
- nil is returned if any element of bools is nil. Otherwise, the last element of
- bools is returned. The value t is returned if there are no elements in bools.
-
- (collect-and #Z(a b c)) => c
- (collect-and #Z(a nil c)) => nil
-
- [Function]
- collect-or bools
-
- collect-or returns the or of the elements in a series. As with the macro or,
- nil is returned if every element of bools is nil. Otherwise, the first non-null
- element of bools is returned. The value nil is returned if there are no
- elements in bools.
-
- (collect-or #Z(nil b c)) => b
- (collect-or #Z()) => nil
-
- [Function]
- collect items
- collect type items
-
- collect returns a sequence containing the elements of the series items. The
- type is a type specifier indicating the type of sequence to be created. It must
- be either a proper subtype of sequence or the symbol bag. If type is omitted,
- it defaults to list. (This function exhibits an argument pattern that is
- unusual for Common Lisp: an ``optional'' argument preceding a required
- argument. This pattern cannot be expressed in the usual manner with &optional.
- It is indicated above by two definition lines, showing the two possible
- argument patterns.)
-
- If the type is bag, a list is created with the elements in whatever order can
- be most efficiently obtained. Otherwise, the order of the elements in the
- sequence is the same as the order in items. If type specifies a length (that
- is, of a vector) this length must be greater than or equal to the length of
- items.
-
- The nth element of items is placed in the nth slot of the sequence produced.
- Any unneeded slots are left in their initial state. Collecting is significantly
- more efficient if it can be determined at compile time whether type is a
- subtype of list or vector and for vectors what the length of the vector is.
-
- (collect #Z(a b c)) => (a b c)
- (collect 'bag #Z(a b c)) => (c a b) or (b a c) or ...
- (collect '(vector integer 3) #Z(1 2 3)) => #(1 2 3)
-
- [Function]
- collect-append sequences
- collect-append type sequences
-
- Given a series of sequences, collect-append returns a new sequence by
- concatenating these sequences together in order. The type is a type specifier
- indicating the type of sequence created and must be a proper subtype of
- sequence. If type is omitted, it defaults to list. (This function exhibits an
- argument pattern that is unusual for Common Lisp: an ``optional'' argument
- preceding a required argument. This pattern cannot be expressed in the usual
- manner with &optional. It is indicated above by two definition lines, showing
- the two possible argument patterns.)
-
- It must be possible for every element of every sequence in the input series to
- be an element of a sequence of type type. The result does not share any
- structure with the sequences in the input.
-
- (collect-append #Z((a b) nil (c d))) => (a b c d)
- (collect-append 'string #Z("a " "big " "cat")) => "a big cat"
-
- [Function]
- collect-nconc lists
-
- collect-nconc nconcs the elements of the series lists together in order and
- returns the result. This is the same as collect-append except that the input
- must be a series of lists, the output is always a list, the concatenation is
- done rapidly by destructively modifying the input elements, and therefore the
- output shares all of its structure with the input elements.
-
- [Function]
- collect-alist keys values
- collect-plist keys values
- collect-hash keys values &key :test :size :rehash-size :rehash-threshold
-
- Given a series of keys and a series of corresponding values, these functions
- return an association list, a property list, and a hash table, respectively.
- Following the order of the input, each keys -values pair is entered into the
- output so that it overrides all earlier associations. If one of the input
- series is longer than the other, the extra elements are ignored. The keyword
- arguments of collect-hash specify attributes of the hash table produced and
- have the same meanings as the arguments to make-hash-table.
-
- (collect-alist #Z(a b c) #Z(1 2)) => ((b . 2) (a . 1))
- (collect-plist #Z(a b c) #Z(1 2)) => (b 2 a 1)
- (collect-hash #Z() #Z(1 2) :test #'eq) => an empty hash table
-
- [Function]
- collect-file file-name items &optional (printer #'print)
-
- This creates a file named file-name and writes the elements of the series items
- into it using the function printer. Printer must accept two inputs: an object
- and an output stream. (For instance, printer can be print, prin1, princ,
- pprint, write-char, write-string, or write-line.) If omitted, printer defaults
- to print. The value t is returned. The file is correctly closed, even if an
- abort occurs.
-
- [Function]
- collect-fn type init function &rest series-inputs
-
- The higher-order function collect-fn supports the general concept of
- collecting. It is identical to collecting-fn except that it returns only the
- last element of each series computed. If there are no elements in these series,
- the values returned by init are passed on directly as the output of collect-fn.
-
- (collect-fn 'integer #'(lambda () 0) #'+ #Z(1 2 3)) => 6
- (collect-fn 'integer #'(lambda () 0) #'+ #Z()) => 0
- (collect-fn 'integer #'(lambda () 1) #'* #Z(1 2 3 4 5)) => 120
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.2.6. Alteration of Series
-
- [change_begin]
- Series that come from scanning data structures such as lists and vectors are
- closely linked to these structures. The function alter can be used to modify
- the underlying data structure with reference to the series derived from it.
- (Conversely, it is possible to modify a series by destructively modifying the
- data structure it is derived from. However, given the lazy evaluation nature of
- series, the effects of such modifications can be very hard to predict. As a
- result, this kind of modification is inadvisable.)
-
- [Function]
- alter destinations items
-
- alter changes the series destinations so that it contains the elements in the
- series items. More importantly, in the manner of setf, the data structure that
- underlies destinations is changed so that if the series destinations were to be
- regenerated, the new values would be obtained. The alteration process stops as
- soon as either input runs out of elements. The value nil is always returned. In
- the example below each negative element in a list is replaced with its square.
-
- (let* ((data (list 1 -2 3 4 -5 6))
- (x (choose-if #'minusp (scan data))))
- (alter x (#M* x x))
- data)
- => (1 4 3 4 25 6)
-
- alter can be applied only to series that are alterable. scan, scan-alist,
- scan-multiple, scan-plist, and scan-lists-of-lists-fringe produce alterable
- series. However, the alterability of the output of scan-lists-of-lists-fringe
- is incomplete. If scan-lists-of-lists-fringe is applied to an object that is a
- leaf, altering the output series does not change the object.
-
- In general, the output of a transducer is alterable as long as the elements of
- the output come directly from the elements of an input that is alterable. In
- particular, the outputs of choose, choose-if, split, split-if, cotruncate,
- until, until-if, and subseries are alterable as long as the corresponding
- inputs are alterable.
-
- [Function]
- to-alter items alter-fn &rest args
-
- Given a series items, to-alter returns an alterable series A containing the
- same elements. The argument alter-fn is a function. The remaining arguments are
- all series. Let these series be S1, ..., Sn. If there are n arguments after
- alter-fn, alter-fn must accept n+1 inputs. If (alter A B) is later encountered,
- the expression (map-fn t alter-fn B S1 ... Sn) is implicitly evaluated. For
- each element in B, alter-fn should make appropriate changes in the data
- structure underlying A.
-
- As an example, consider the following definition of a series function that
- scans the elements of a list. Alteration is performed by changing cons cells in
- the list being scanned.
-
- (defun scan-list (list)
- (declare (optimizable-series-function))
- (let ((sublists (scan-sublists list)))
- (to-alter (#Mcar sublists)
- #'(lambda (new parent) (setf (car parent) new))
- sublists)))
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.3. Optimization
-
- [change_begin]
- Series expressions are transformed into loops by pipelining them-the
- computation is converted from a form where entire series are computed one after
- the other to a form where the series are incrementally computed in parallel. In
- the resulting loop, each individual element is computed just once, used, and
- then discarded before the next element is computed. For this pipelining to be
- possible, a number of restrictions have to be satisfied. Before these
- restrictions are explained, it will be useful to consider a related issue.
-
- The composition of two series functions cannot be pipelined unless the
- destination function consumes series elements in the same order that the source
- function produces them. Taken together, the series functions guarantee that
- this will always be true, because they all follow the same fixed processing
- order. In particular, they are all preorder functions-they process the elements
- of their series inputs and outputs in ascending order starting with the first
- element. Further, while it is easy for users to define new series functions, it
- is impossible to define one that is not a preorder function.
-
- It turns out that most series operations can easily be implemented in a
- preorder fashion, the most notable exceptions being reversal and sorting. As a
- result, little is lost by outlawing non-preorder series functions. If some
- non-preorder operation has to be applied to a series, the series can be
- collected into a list or vector and the operation applied to this new data
- structure. (This is inefficient, but no less efficient than what would be
- required if non-preorder series functions were supported.)
- [change_end]
-
- -------------------------------------------------------------------------------
-
- * Basic Restrictions
- * Constraint Cycles
- * Defining New Series Functions
- * Declarations
-
- -------------------------------------------------------------------------------
-
- A.3.1. Basic Restrictions
-
- [change_begin]
- The transformation of series expressions into loops is required to occur at
- some time before compiled code is actually run. Optimization may or may not be
- applied to interpreted code. If any of the restrictions described below are
- violated, optimization is not possible. In this situation, a warning message is
- issued at the time optimization is attempted and the code is left unoptimized.
- This is not a fatal error and does not prevent the correct results from being
- computed. However, given the large improvements in efficiency to be gained, it
- is well worth fixing any violations that occur. This is usually easy to do.
-
- [Variable]
- *suppress-series-warnings*
-
- If this variable is set (or bound) to anything other than its default value of
- nil, warnings about conditions that block the optimization of series
- expressions are suppressed.
-
- Before the restrictions on series expressions are discussed, it will be useful
- to define precisely what is meant by the term series expression. This term is
- semantic rather than syntactic in nature. Imagine a program converted from Lisp
- code into a data flow graph. In a data flow graph, functions are represented as
- boxes, and both control flow and data flow are represented as arrows between
- the boxes. Constructs such as let and setq are converted into patterns of data
- flow arcs. Control constructs such as if and loop are converted into patterns
- of control flow arcs. Suppose further that all loops have been converted into
- tail recursions so that the graph is acyclic.
-
- A series expression is a subgraph of the data flow graph for a program that
- contains a group of interacting series functions. More specifically, given a
- call f on a series function, the series expression E containing it is defined
- as follows. E contains f. Every function using a series created by a function
- in E is in E. Every function computing a series used by a function in E is in
- E. Finally, suppose that two functions g and h are in E and that there is a
- data flow path consisting of series and/or non-series data flow arcs from g to
- h. Every function touched by this path (be it a series function or not) is in
- E.
-
- For optimization to be possible, series expressions have to be statically
- analyzable. As with most other optimization processes, a series expression
- cannot be transformed into a loop at compile time, unless it can be determined
- at compile time exactly what computation is being performed. This places a
- number of relatively minor limits on what can be written. For example, for
- optimization to be possible the type arguments to higher-order functions such
- as map-fn and collecting-fn have to be quoted constants. Similarly, the numeric
- arguments to chunk have to be constants. In addition, if funcall is used to
- call a series function, the function called has to be of the form (function
- ...).
-
- For optimization to be possible, every series created within a series
- expression must be used solely inside the expression. If a series is
- transmitted outside of the expression that creates it, it has to be physically
- represented as a whole. This is incompatible with the transformations required
- to pipeline the creating expression. To avoid this problem, a series must not
- be returned as a result of a series expression as a whole, assigned to a free
- variable, assigned to a special variable, or stored in a data structure. A
- corollary of the last point is that when defining new optimizable series
- functions, series cannot be passed into &rest arguments. Further, optimization
- is blocked if a series is passed as an argument to an ordinary Lisp function.
- Series can be passed only to the series functions in section A.2 and to new
- series functions defined using the declaration optimizable-series-function.
-
- For optimization to be possible, series expressions must correspond to
- straight-line computations. That is to say, the data flow graph corresponding
- to a series expression cannot contain any conditional branches. (Complex
- control flow is incompatible with pipelining.) Optimization is possible in the
- presence of standard straight-line forms such as progn, funcall, setq, lambda,
- let, let*, and multiple-value-bind as long as none of the variables bound are
- special. There is also no problem with macros as long as they expand into
- series functions and straight-line forms. However, optimization is blocked by
- forms that specify complex control flow (i.e., conditionals if, cond, etc.,
- looping constructs loop, do, etc., or branching constructs tagbody, go, catch,
- etc.).
-
- In the first example below, optimization is blocked, because the if form is
- inside the series expression. In the second example, however, optimization is
- possible, because although the if feeds data to the series expression, it is
- not inside the corresponding subgraph. Both of the expressions below produce
- the same value, but the second one is much more efficient.
-
- (collect (if flag (scan x) (scan y))) ;Warning message issued
- (collect (scan (if flag x y)))
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.3.2. Constraint Cycles
-
- [change_begin]
- Even if a series expression satisfies all of the restrictions above, it still
- may not be possible to transform the expression into a loop. The sole remaining
- problem is that if a series is used in two places, the two uses may place
- incompatible constraints on the times at which series elements should be
- produced.
-
- The series expression below shows a situation where this problem arises. The
- expression creates a series x of the elements in a list. It then creates a
- normalized series by dividing each element of x by the sum of the elements in
- x. Finally, the expression returns the maximum of the normalized elements.
-
- (let ((x (scan '(1 2 5 2)))) ;Warning message issued
- (collect-max (#M/ x (series (collect-sum x))))) => 1/2
-
- The two uses of x in the expression place contradictory constraints on the way
- pipelined evaluation must proceed; collect-sum requires that all of the
- elements of x be produced before the sum can be returned, and series requires
- that its input be available before it can start to produce its output. However,
- #M/ requires that the first element of x be available at the same time as the
- first element of the output of series. For pipelining to work, the first
- element of the output of series (and therefore the output of collect-sum) must
- be available before the second element of x is produced. Unfortunately, this is
- impossible.
-
- The essence of the inconsistency above is the cycle of constraints used in the
- argument. This in turn stems from a cycle in the data flow graph underlying the
- expression. In figure A-1 function calls are represented by boxes and data flow
- is represented by arrows. Simple arrows indicate the flow of series values and
- cross-hatched arrows indicate the flow of non-series values.
-
-
-
- ----------------------------------------------------------------
- Figure A-1: A Constraint Cycle in a Series Expression
-
- +------+ +-----+ +-----+
- -->| scan |----------------------------------->| #M/ |-->| max |-->
- +------+ \ +-----+ +--------+ / +-----+ +-----+
- -->| sum |-++++->| series |--
- +-----+ +--------+
-
- ----------------------------------------------------------------
-
- Given a data flow graph corresponding to a series expression, a constraint
- cycle is a closed oriented loop of data flow arcs such that each arc is
- traversed exactly once and no non-series arc is traversed backward. (Series
- data flow arcs can be traversed in either direction.) A constraint cycle is
- said to pass through an input or output port when exactly one of the arcs in
- the cycle touches the port. In figure A-1 the data flow arcs touching scan,
- sum, series, and #M/ form a constraint cycle. Note that if the output of scan
- were not a series, this loop would not be a constraint cycle, because there
- would be no valid way to traverse it. Also note that while the constraint cycle
- passes through all the other ports it touches, it does not pass through the
- output of scan.
-
- Whenever a constraint cycle passes through a non-series output, an argument
- analogous to the one above can be constructed and therefore pipelining will be
- impossible. When this situation arises, a warning message is issued identifying
- the problematical port and the cycle passing through it. For instance, the
- warning triggered by the example above states that the constraint cycle
- associated with scan, collect-sum, series, and #M/ passes through the
- non-series output of collect-sum.
-
- Given this kind of detailed information, it is easy to alleviate the problem.
- To start with, every cycle must contain at least one function that has two
- series data flows leaving it. At worst, the cycle can be broken by duplicating
- this function (and any functions computing series used by it). For instance,
- the example above can be rewritten as shown below.
-
- (let ((x (scan '(1 2 5 2)))
- (sum (collect-sum (scan '(1 2 5 2)))))
- (collect-max (#M/ x (series sum))))
- => 1/2
-
- It would be easy enough to automatically apply code copying to break
- problematical constraint cycles. However, this is not done for two reasons.
- First, there is considerable virtue in maintaining the property that each
- function in a series expression turns into one piece of computation in the loop
- produced. Users can be confident that series expressions that look simple and
- efficient actually are simple and efficient. Second, with a little creativity,
- constraint problems can often be resolved in ways that are much more efficient
- than copying code. In the example above, the conflict can be eliminated
- efficiently by interchanging the operation of computing the maximum with the
- operation of normalizing an element.
-
- (let ((x (scan '(1 2 5 2))))
- (/ (collect-max x) (collect-sum x))) => 1/2
-
- The restriction that optimizable series expressions cannot contain constraint
- cycles that pass through non-series outputs places limitations on the
- qualitative character of optimizable series expressions. In particular, they
- all must have the general form of creating some number of series using
- scanners, computing various intermediate series using transducers, and then
- computing one or more summary results using collectors. The output of a
- collector cannot be used in the intermediate computation unless it is the
- output of a separate subexpression.
-
- It is worthy of note that the last expression above fixes the constraint
- conflict by moving the non-series output out of the cycle, rather than by
- breaking the cycle. This illustrates the fact that constraint cycles that do
- not pass through non-series outputs do not necessarily cause problems. They
- cause problems only if they pass through off-line ports.
-
- A series input port or series output port of a series function is on-line if
- and only if it is processed in lockstep with all the other on-line ports as
- follows: the initial element of each on-line input is read, then the initial
- element of each on-line output is written, then the second element of each
- on-line input is read, then the second element of each on-line output is
- written, and so on. Ports that are not on-line are off-line. If all of the
- series ports of a function are on-line, the function is said to be on-line;
- otherwise, it is off-line. (The above extends the standard definition of the
- term on-line so that it applies to individual ports as well as whole
- functions.)
-
- If all of the ports a cycle passes through are on-line, the lockstep processing
- of these ports guarantees that there cannot be any conflicts between the
- constraints associated with the cycle. However, passing through an off-line
- port leads to the same kinds of problems as passing through a non-series
- output.
-
- Most of the series functions are on-line. In particular, scanners and
- collectors are all on-line as are many transducers. However, the transducers in
- section A.2.4 are off-line. In particular, the series inputs of catenate,
- choose-if, chunk, expand, mask, mingle, positions, and subseries along with the
- series outputs of choose, split, and split-if are off-line.
-
- In summary, the fourth and final restriction is that for optimization to be
- possible, a series expression cannot contain a constraint cycle that passes
- through a non-series output or an off-line port. Whenever this restriction is
- violated a warning message is issued. Violations can be fixed either by
- breaking the cycle or restructuring the computation so that the offending port
- is removed from the cycle.
- [change_end]
-
- -------------------------------------------------------------------------------
-
- A.3.3. Defining New Series Functions
-
- [change_begin]
- New functions operating on series can be defined just as easily as new
- functions operating on any other data type. However, expressions containing
- these new functions cannot be transformed into loops unless a complete analysis
- of the functions is available. Among other things, this implies that the
- definition of a new series function must appear before its first use.
-
- [Declaration specifier]
- optimizable-series-function
-
- The declaration specifier (optimizable-series-function integer) indicates that
- the function being defined is a series function that needs to be analyzed so
- that it can be optimized when it appears in series expressions. (A warning is
- issued if the function being defined neither takes a series as input nor
- produces a series as output.) Integer (default 1) specifies the number of
- values returned by the function being defined. (This cannot necessarily be
- determined by local analysis.) The only place optimizable-series-function is
- allowed to appear is in a declaration immediately inside a defun. As an
- example, the following shows how a simplified version of collect-sum could be
- defined.
-
- (defun simple-collect-sum (numbers)
- (declare (optimizable-series-function 1))
- (collect-fn 'number #'(lambda () 0) #'+ numbers))
-
- [Declaration specifier]
- off-line-port
-
- The declaration specifier (off-line-port port-spec1 port-spec2 ...) specifies
- that the indicated inputs and outputs are off-line. This declaration specifier
- is only allowed in a defun that contains the declaration
- optimizable-series-function. Each port-spec must either be a symbol that is one
- of the inputs of the function or an integer j indicating the jth output
- (counting from zero). For example, (off-line-port x 1) indicates that the input
- x and the secon a large
- number of useful operations that can be performed on series. However, this
- functionality can be boiled down to a small number of primitive constructs.
-
- collecting-fn embodies the fundamental idea of series computations that utilize
- internal state. It can be used as the basis for defining any on-line
- transducer.
-
- until embodies the fundamental idea of producing a series that is shorter than
- the shortest input series. In particular, it embodies the idea of computing a
- bounded series from non-series inputs. Together with collecting-fn, until can
- be used to define scan-fn, which can be used as the basis for defining all the
- other scanners.
-
- collect-last embodies the fundamental idea of producing a non-series value from
- a series. Together with collecting-fn, it can be used to define collect-fn,
- which (with the occasional assistance of until) can be used as the basis for
- defining all the other collectors.
-
- producing embodies the fundamental idea of preorder computation. It can be used
- as the basis for defining all the other series functions, including the
- off-line transducers.
-
- In addition to the above, four primitives support various specialized aspects
- of series functions. Alterability is supported by the function to-alter and the
- declaration propagate-alterability. The propagation of type information is
- supported by the type specifier series-element-type. The best implementation of
- certain series functions requires the form encapsulated.
-
- [Macro]
- producing output-list input-list {declaration}* {form}*
-
- producing computes and returns a group of series and non-series outputs given a
- group of series and non-series inputs. The key feature of producing is that
- some or all of the series inputs and outputs can be processed in an off-line
- way. To support this, the processing in the body (consisting of the forms) is
- performed from the perspective of generators and gatherers (see appendix B).
- Each series input is converted to a generator before being used in the body.
- Each series output is associated with a gatherer in the body.
-
- The output-list has the same syntax as the binding list of a let. The names of
- the variables must be distinct from each other and from the names of the
- variables in the input-list. If there are n variables in the output-list,
- producing computes n outputs. There must be at least one output variable. The
- variables act as the names for the outputs and can be used in either of two
- ways. First, if an output variable has a value associated with it in the
- output-list, then the variable is treated as holding a non-series value. The
- variable is initialized to the indicated value and can be used in any way
- desired in the body. The eventual output value is whatever value is in the
- variable when the execution of the body terminates. Second, if an output
- variable does not have a value associated with it in the output-list, the
- variable is given as its value a gatherer that collects elements. The only
- valid way to use the variable in the body is in a call on next-out. The output
- returned is a series containing these elements. If the body never terminates,
- this series is unbounded.
-
- The input-list also has the same syntax as the binding list of a let. The names
- of the variables must be distinct from each other and the names of the
- variables in the output-list. The values can be series or non-series. If the
- value is not explicitly specified, it defaults to nil. The variables act
- logically both as inputs and state variables and can be used in one of two
- ways. First, if an input variable is associated with a non-series value, then
- it is given this value before the evaluation of the body begins and can be used
- in any way desired in the body. Second, if an input variable is associated with
- a series, then the variable is given a generator corresponding to this series
- as its initial value. The only valid way to use the variable in the body is in
- a call on next-in.
-
- There can be declarations at the start of the body. However, the only
- declarations allowed are ignore declarations, type declarations, and
- propagate-alterability declarations (see below). In particular, it is an error
- for any of the input or output variables to be special.
-
- In conception, the body can contain arbitrary Lisp expressions. After the
- appropriate generators and gatherers have been set up, the body is executed
- until it terminates. If the body never terminates, the series outputs (if any)
- are unbounded in length and the non-series outputs (if any) are never produced.
-
- Although easy to understand, this view of what can happen in the body presents
- severe difficulties when optimizing (and even when evaluating) series
- expressions that contain calls on producing. As a result, several limitations
- are imposed on the form of the body to simplify the processing required.
-
- The first limitation is that, exclusive of any declarations, the body must have
- the form (loop (tagbody ...)). The following example shows how producing could
- be used to implement a scanner creating an unbounded series of integers.
-
- (producing (nums) ((num 0))
- (declare (integer num) (type (series integer) nums))
- (loop
- (tagbody
- (setq num (1+ num))
- (next-out nums num))))
- => #Z(1 2 3 4 ...)
-
- The second limitation is that the form terminate-producing must be used to
- terminate the execution of the body. Any other method of terminating the body
- (with return, for example) is an error. The following example shows how
- producing could be used to implement the operation of summing a series. The
- function terminate-producing is used to stop the computation when numbers runs
- out of elements.
-
- (producing ((sum 0)) ((numbers #Z(1 2 3)) num)
- (loop
- (tagbody
- (setq num (next-in numbers (terminate-producing)))
- (setq sum (+ sum num)))))
- => 6
-
- The third limitation is that calls on next-out associated with output variables
- must appear at top level in the tagbody in the body. They cannot be nested in
- other forms. In addition, an output variable can be the destination of at most
- one call on next-out and if it is the destination of a next-out, it cannot be
- used in any other way.
-
- If the call on next-out for a given output appears in the final part of the
- tagbody in the body, after everything other than other calls on next-out, then
- the output is an on-line output-a new value is written on every cycle of the
- body. Otherwise the output is off-line.
-
- The following example shows how producing could be used to split a series into
- two parts. Items are read in one at a time and tested. Depending on the test,
- they are written to one of two outputs. Note the use of labels and branches to
- keep the calls on next-out at top level. Both outputs are off-line. The first
- example above shows an on-line output.
-
- (producing (items-1 items-2) ((items #Z(1 -2 3 -4)) item)
- (loop
- (tagbody (setq item (next-in items (terminate-producing)))
- (if (not (plusp item)) (go D))
- (next-out items-1 item)
- (go F)
- D (next-out items-2 item)
- F )))
- => #Z(1 3) and #Z(-2 -4)
-
- The fourth limitation is that the calls on next-in associated with an input
- variable v must appear at top level in the tagbody in the body, nested in
- assignments of the form (setq var (next-in v ...)). They cannot be nested in
- other forms. In addition, an input variable can be the source for at most one
- call on next-in and if it is the source for a next-in, it cannot be used in any
- other way.
-
- If the call on next-in for a given input has as its sole termination action
- (terminate-producing) and appears in the initial part of the tagbody in the
- body, before anything other than similar calls on next-in, then the input is an
- on-line input-a new value is read on every cycle of the body. Otherwise the
- input is off-line.
-
- The example below shows how producing could be used to concatenate two series.
- To start with, elements are read from the first input series. When this runs
- out, a flag is set and reading begins from the second input. Both inputs are
- off-line. (Compare this to the example above, which shows an on-line input.)
-
- (producing (items) ((item-1 #Z(1 2))
- (item-2 #Z(3 4))
- (in-2 nil)
- item)
- (loop
- (tagbody (if in-2 (go D))
- (setq item (next-in item-1 (setq in-2 t) (go D)))
- (go F)
- D (setq item (next-in item-2 (terminate-producing)))
- F (next-out items item))))
- => #Z(1 2 3 4)
-
- [Macro]
- terminate-producing
-
- This form (which takes no arguments) is used to terminate execution of (the
- expansion of) the producing macro.
-
- As with the form go, terminate-producing does not return any values; rather,
- control immediately leaves the current context.
-
- The form terminate-producing is allowed to appear only in a producing body and
- causes the termination of the enclosing call on producing.
-
- [Declaration specifier]
- propagate-alterability
-
- The declaration specifier (propagate-alterability input output) indicates that
- attempts to alter an element of output should be satisfied by altering the
- corresponding element of input. (The corresponding element of input is the one
- most recently read at the moment when the output element is written.)
-
- This declaration may appear only in a call on producing. The input and output
- arguments must be an input and an output, respectively, of the producing macro.
- The example below shows how the propagation of alterability could be supported
- in a simplified version of until.
-
- (defun simple-until (bools items)
- (declare (optimizable-series-function))
- (producing (z) ((x bools) (y items) bool item)
- (declare (propagate-alterability y z))
- (loop
- (tagbody
- (setq bool (next-in x (terminate-producing)))
- (setq item (next-in y (terminate-producing)))
- (if bool (terminate-producing))
- (next-out z item)))))
-
- [Macro]
- encapsulated encapsulating-fn scanner-or-collector
-
- Some of the features provided by Common Lisp are supported solely by
- encapsulating forms. For example, there is no way to specify a cleanup
- expression that will always be run, even when an abort occurs, without using
- unwind-protect. encapsulated makes it possible to take advantage of forms such
- as unwind-protect when defining a series function.
-
- encapsulated specifies a function that places an encapsulating form around the
- computation performed by its second argument. The first argument must be a
- quoted function that takes a Lisp expression and wraps the appropriate
- encapsulating form around it, returning the resulting code. The second input
- must be a literal call on scan-fn, scan-fn-inclusive, or collect-fn. The second
- argument can count on being evaluated in the scope of the encapsulating form.
- The values returned by the second argument are returned as the values of
- encapsulated. The following shows how encapsulated could be used to define a
- simplified version of collect-file.
-
- (defun collect-file-wrap (file name body)
- `(with-open-file (,file ,name :direction :output) ,body))
- (defmacro simple-collect-file (name items)
- (let ((file (gensym)))
- `(encapsulated #'(lambda (body)
- (collect-file-wrap ',file ',name body))
- (collect-fn t #'(lambda () t)
- #'(lambda (state item)
- (print item ,file)
- state)
- ,items))))
-
- [change_end]
-
- -------------------------------------------------------------------------------
-
-